{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "神经网络由在数据上进行操作的层/模块构成。torch.nn 命名空间提供了所有你用来构建你自己的神经网络所需的组件。PyTorch 中每个模块都是 nn.Module 的子类。一个由其他模块(层)组成的神经网络自身也是一个模块。这种嵌套的结构让构建和管理复杂的结构更轻松。\n",
    "\n",
    "在下面的章节中，我们将构建一个神经网络来给 FashionMNIST 数据集的图片分类。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "outputs": [],
   "source": [
    "import os\n",
    "import torch\n",
    "from torch import nn\n",
    "from torch.utils.data import DataLoader\n",
    "from torchvision import datasets, transforms"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 获取训练的设备\n",
    "我们希望能够在一个硬件加速设备比如 GPU 或者 MPS 上(如果有的话)训练我们的模型。让我们检查 torch.cuda 和 torch.backend.mps 是否可用，否则我们使用 CPU。"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using cpu device\n"
     ]
    }
   ],
   "source": [
    "device = ('cuda'\n",
    "          if torch.cuda.is_available()\n",
    "          else 'mps'\n",
    "if torch.backends.mps.is_available()\n",
    "else 'cpu'\n",
    "          )\n",
    "print(f\"Using {device} device\")"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 定义类\n",
    "\n",
    "我们通过子类化 nn.Module 来定义我们的神经网络，并在 __init__ 中初始化神经网络。每个 nn.Module 子类都会在 forward 方法中实现对输入数据的操作。"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "outputs": [],
   "source": [
    "class NeuralNetwork(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(NeuralNetwork, self).__init__()\n",
    "        self.flatten = nn.Flatten()\n",
    "        self.linear_relu_stack = nn.Sequential(\n",
    "            nn.Linear(28 * 28, 512),\n",
    "            nn.ReLU(),\n",
    "            nn.Linear(512, 512),\n",
    "            nn.ReLU(),\n",
    "            nn.Linear(512, 10)\n",
    "        )\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = self.flatten(x)\n",
    "        logits = self.linear_relu_stack(x)\n",
    "        return  logits"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NeuralNetwork(\n",
      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
      "  (linear_relu_stack): Sequential(\n",
      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
      "    (1): ReLU()\n",
      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
      "    (3): ReLU()\n",
      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
      "  )\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "# 我们创建一个 NeuralNetwork 实例，并将它发送到 device ，然后打印它的结构.\n",
    "\n",
    "\n",
    "model = NeuralNetwork().to(device)\n",
    "print(model)\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "为了使用这个模型，我们给它传递输入数据。这将会执行模型的 forward 函数，以及一些后台操作。请不要直接调用 model.forward() !\n",
    "\n",
    "将数据传递给模型并调用后返回一个 2 维tensor(第0维对应一组 10 个代表每种类型的原始预测值，第1维对应该类型对应的原始预测值)。我们将它传递给一个 nn.Softmax 模块的实例来来获得预测概率。\n",
    "\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Predicted class:tensor([6])\n"
     ]
    }
   ],
   "source": [
    "X = torch.rand(1, 28, 28, device=device)\n",
    "logits = model(X)\n",
    "pred_probab = nn.Softmax(dim=1)(logits)\n",
    "y_pred = pred_probab.argmax(1)\n",
    "print(f\"Predicted class:{y_pred}\")"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 模型层\n",
    "\n",
    "让我们分析这个 FashionMNIST 模型的各层。为了说明，我们会取一个由 3 张 28x28 的图片数据组成的样例数据\n",
    "\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 28, 28])\n"
     ]
    }
   ],
   "source": [
    "input_image = torch.rand(3, 28, 28)\n",
    "print(input_image.size())"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 784])\n"
     ]
    }
   ],
   "source": [
    "# nn.Flatten:将每个 2 维的 28x28 图像转换成一个包含 784 像素值的连续数组(微批数据的维度(第0维)保留了).\n",
    "flatten = nn.Flatten()\n",
    "flat_image = flatten(input_image)\n",
    "print(flat_image.size())"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 20])\n"
     ]
    }
   ],
   "source": [
    "# nn.Linear:对输入值使用自己存储的权重 (w) 和偏差 (b) 来做线性转换\n",
    "layer1 = nn.Linear(in_features=28*28, out_features=20)\n",
    "hidden1 = layer1(flat_image)\n",
    "print(hidden1.size())"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[0.4715, 0.0000, 0.0000, 0.0000, 0.0697, 0.0000, 0.0000, 0.4946, 0.0330,\n",
      "         0.6178, 0.5230, 0.0000, 0.5157, 0.0000, 0.0000, 0.0000, 0.0000, 0.2668,\n",
      "         0.0000, 0.0000],\n",
      "        [0.3136, 0.1050, 0.0000, 0.0000, 0.0000, 0.0000, 0.1170, 0.5384, 0.3865,\n",
      "         0.0565, 0.2753, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.5495,\n",
      "         0.0000, 0.0000],\n",
      "        [0.4215, 0.3972, 0.0000, 0.0000, 0.0000, 0.0000, 0.1507, 0.7593, 0.1153,\n",
      "         0.1234, 0.0066, 0.0000, 0.3669, 0.0000, 0.0000, 0.0000, 0.0000, 0.4761,\n",
      "         0.0000, 0.0000]], grad_fn=<ReluBackward0>)\n"
     ]
    }
   ],
   "source": [
    "# nn.ReLU:非线性的激活函在模型的输入和输出之间数创造了复杂的映射关系。它们在线性转换之后引入非线性，帮助神经网络学习更广阔范围的现象。\n",
    "hidden1 = nn.ReLU()(hidden1)\n",
    "print(hidden1)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "outputs": [],
   "source": [
    "# nn.Sequential 是一个模块的有序容器。数据会沿着模块定义的顺序流动。\n",
    "seq_modules = nn.Sequential(\n",
    "    flatten,\n",
    "    layer1,\n",
    "    nn.ReLU(),\n",
    "    nn.Linear(20, 10)\n",
    ")\n",
    "input_image = torch.rand(3, 28, 28)\n",
    "logits = seq_modules(input_image)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "outputs": [],
   "source": [
    " # nn.Softmax 模块。这些 logits 值被缩放到 [0,1]，代表模型对每种类型的预测概率， dim 参数代表沿着该维度数值应该加总为 1.\n",
    "softmax = nn.Softmax(dim=1)\n",
    "pred_probab = softmax(logits)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 模型参数\n",
    "\n",
    "神经网络内的许多层都是参数化的，比如可以在训练中与层相关的可优化权重和偏置。 nn.Module 的子类会自动追踪所有你定义在模型对象中的字段，并通过你模型的 parameters() 或者 named_parameters() 方法访问所有参数。"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model structure: NeuralNetwork(\n",
      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
      "  (linear_relu_stack): Sequential(\n",
      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
      "    (1): ReLU()\n",
      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
      "    (3): ReLU()\n",
      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
      "  )\n",
      ")\n",
      "\n",
      "\n",
      "Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values : tensor([[-0.0220,  0.0311,  0.0079,  ..., -0.0298, -0.0065,  0.0287],\n",
      "        [ 0.0275,  0.0313, -0.0344,  ..., -0.0003, -0.0203, -0.0284]],\n",
      "       grad_fn=<SliceBackward0>) \n",
      "\n",
      "Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values : tensor([-0.0049, -0.0211], grad_fn=<SliceBackward0>) \n",
      "\n",
      "Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values : tensor([[-0.0251,  0.0154,  0.0066,  ...,  0.0403,  0.0082, -0.0194],\n",
      "        [ 0.0334,  0.0036,  0.0218,  ..., -0.0120, -0.0166, -0.0073]],\n",
      "       grad_fn=<SliceBackward0>) \n",
      "\n",
      "Layer: linear_relu_stack.2.bias | Size: torch.Size([512]) | Values : tensor([-0.0030,  0.0143], grad_fn=<SliceBackward0>) \n",
      "\n",
      "Layer: linear_relu_stack.4.weight | Size: torch.Size([10, 512]) | Values : tensor([[-0.0025, -0.0049,  0.0384,  ...,  0.0323,  0.0145,  0.0176],\n",
      "        [-0.0400,  0.0182, -0.0145,  ...,  0.0207, -0.0319, -0.0366]],\n",
      "       grad_fn=<SliceBackward0>) \n",
      "\n",
      "Layer: linear_relu_stack.4.bias | Size: torch.Size([10]) | Values : tensor([-0.0097, -0.0264], grad_fn=<SliceBackward0>) \n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(f\"Model structure: {model}\\n\\n\")\n",
    "\n",
    "for name, param in model.named_parameters():\n",
    "    print(f\"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \\n\")"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}